Conference Proceedings
SynthNet: Learning to synthesize music end-to-end
F Schimbinschi, C Walder, SM Erfani, J Bailey
IJCAI International Joint Conference on Artificial Intelligence | IJCAI | Published: 2019
Open access
Abstract
We consider the problem of learning a mapping directly from annotated music to waveforms, bypassing traditional single note synthesis. We propose a specific architecture based on WaveNet, a convolutional autoregressive generative model designed for text to speech. We investigate the representations learned by these models on music and conclude that mappings between musical notes and the instrument timbre can be learned directly from the raw audio coupled with the musical score, in binary piano roll format. Our model requires minimal training data (9 minutes), is substantially better in quality and converges 6 times faster in comparison to strong baselines in the form of powerful text to spee..
Grants
Awarded by Australian Research Council
Funding Acknowledgements
This research was supported by ARC Discovery Grant DP170102472, Data61 CSIRO and sponsored by NVIDIA.